IN THE CLAIMS

The following is a listing of claims of the present application: 

1. (Currently Amended) A method of processing audio-based data associated with a
particular language, the method comprising the steps of: 
storing the audio-based data; 

generating a textual representation of the audio-based data, the textual representation being 
in the form of one or more semantic units corresponding to the audio-based data, wherein [[a]] each of
at least a portion of the one or more semantic units comprises [[a minimal unit of language having a
semantic meaning]] a sub-unit of a word and not a complete word itself; and

indexing the one or more semantic units and storing the one or more indexed semantic units 
for use in searching the stored audio-based data in response to a user query, wherein at least one 
segment of the stored audio-based data is retrievable by obtaining a location indicative of where the 
at least one segment is stored from a direct correspondence between at least one of the indexed 
semantic units and the at least one segment. 
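
Illustrative note (not part of the claim language): the indexing and location-lookup steps recited above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation. The names `Segment`, `build_index`, and the sample syllables and time stamps are hypothetical; the claims do not specify a data structure, and a plain dictionary from semantic unit to stored locations is assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical location record: where a stretch of stored audio lives."""
    recording_id: str
    start_sec: float
    end_sec: float


def build_index(decoded_units):
    """Map each semantic unit (e.g. a syllable) to the segments where it occurs.

    decoded_units: iterable of (unit, Segment) pairs, as might be produced by
    a recognizer or a transcript (assumed input format).
    """
    index = {}
    for unit, segment in decoded_units:
        index.setdefault(unit, []).append(segment)
    return index


# Made-up sample data for illustration only.
decoded = [
    ("ba", Segment("rec1", 0.0, 0.4)),
    ("na", Segment("rec1", 0.4, 0.7)),
    ("na", Segment("rec1", 0.7, 1.1)),
]
index = build_index(decoded)

# Direct correspondence: an indexed unit yields the stored locations.
locations = index.get("na", [])
```

Each indexed unit points straight at one or more stored locations, so a segment can be fetched without rescanning the audio.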

2. (Original) The method of claim 1, wherein the semantic unit is a syllable. 

3. (Original) The method of claim 2, wherein the syllable is a phonetically based syllable. 

4. (Original) The method of claim 1, wherein the semantic unit is a morpheme. 

5. (Original) The method of claim 1, wherein the generating step comprises decoding the 
audio-based data in accordance with a speech recognition system. 

6. (Original) The method of claim 5, wherein the speech recognition system employs a 
semantic unit based language model. 



7. (Original) The method of claim 1, wherein the indexing step comprises time stamping the 
one or more semantic units. 
8. (Original) The method of claim 1, wherein the searching step comprises:

processing the user query to generate one or more semantic units representing the 
information that the user seeks to retrieve; 

searching the one or more indexed semantic units to find a substantial match with the one 
or more semantic units associated with the user query; and 

retrieving one or more segments of the audio-based data using the one or more indexed 
semantic units that match the one or more semantic units associated with the user query. 



9. (Original) The method of claim 8, wherein the searching step further comprises presenting 
the retrieved data to the user. 



10. (Original) The method of claim 1, wherein the particular language is an Asian based 
language. 



11. (Original) The method of claim 10, wherein the particular language is Chinese. 



12. (Original) The method of claim 11, wherein the semantic unit is a Chinese character.



13. (Original) The method of claim 1, wherein the particular language is a Slavic based 
language. 



14. (Original) The method of claim 1, wherein the one or more semantic units are indexed 
according to speaker attributes. 

15. (Original) The method of claim 1, wherein the one or more semantic units are indexed 
according to at least one of when the audio based data was produced and where the audio based data 
was produced. 
16. (Original) The method of claim 1, further comprising the step of storing video based data
associated with the audio based data for use in searching the stored audio based data and the video 
based data in response to a user query. 

17. (Original) The method of claim 16, wherein the searching step includes a hierarchical 
search routine. 

18. (Original) The method of claim 1, wherein the generating step comprises 
stenographically transcribing the audio-based data to generate the textual representation. 

19. (Currently Amended) Apparatus for processing audio-based data associated with a 
particular language, the apparatus comprising: 

a memory; and 

at least one processor coupled to the memory and operative to: (i) store the audio-based data
in the memory; (ii) generate a textual representation of the audio-based data, the textual 
representation being in the form of one or more semantic units corresponding to the audio-based 
data, wherein [[a]] each of at least a portion of the one or more semantic units comprises [[a minimal unit
of language having a semantic meaning]] a sub-unit of a word and not a complete word itself; and (iii)
index the one or more semantic units and store the one or more indexed semantic units for use in 
searching the stored audio-based data in response to a user query, wherein at least one segment of 
the stored audio-based data is retrievable by obtaining a location indicative of where the at least one 
segment is stored from a direct correspondence between at least one of the indexed semantic units 
and the at least one segment. 

20. (Currently Amended) An audio-based data indexing and retrieval system for processing 
audio-based data associated with a particular language, the system comprising: 

memory for storing the audio-based data; 

a semantic unit based speech recognition system for generating a textual representation of 
the audio-based data, the textual representation being in the form of one or more semantic units 
corresponding to the audio-based data, wherein [[a]] each of at least a portion of the one or more
semantic units comprises [[a minimal unit of language having a semantic meaning]] a sub-unit of a
word and not a complete word itself;

an indexing and storage module, operatively coupled to the semantic unit based speech 
recognition system and the memory, for indexing the one or more semantic units and storing the one 
or more indexed semantic units; and 

a search engine, operatively coupled to the indexing and storage module and the memory, 
for searching the one or more indexed semantic units for a match with one or more semantic units 
associated with a user query, and for retrieving the stored audio based data based on the one or more 
indexed semantic units, wherein at least one segment of the stored audio-based data is retrievable 
by obtaining a location indicative of where the at least one segment is stored from a direct 
correspondence between at least one of the indexed semantic units and the at least one segment. 

21. (Previously Presented) The method of claim 5, wherein the speech recognition system
employs a syllable language model. 

22. (Previously Presented) The method of claim 21, wherein production of the syllable 
language model comprises the steps of: 

transcribing audio data to generate syllables; 

deriving conditional probabilities of distribution based on the generated syllables; and

using syllable counts and the conditional probabilities to construct the syllable language
model.
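
Illustrative note (not part of the claim language): one way the three recited steps could fit together is sketched below. It assumes a bigram model with maximum-likelihood estimates, which the claim does not specify; the function name `build_syllable_bigram_model`, the `<s>` start marker, and the sample transcripts are hypothetical.

```python
from collections import Counter


def build_syllable_bigram_model(transcripts):
    """Build counts and conditional probabilities P(cur | prev) over syllables.

    transcripts: list of syllable sequences, assumed to come from already
    transcribed audio. A "<s>" marker is prepended to each sequence so the
    first syllable also has a context.
    """
    context_counts = Counter()  # how often each syllable appears as a context
    bigram_counts = Counter()   # how often each (prev, cur) pair appears
    for syllables in transcripts:
        padded = ["<s>"] + list(syllables)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    # Maximum-likelihood estimate: count(prev, cur) / count(prev as context).
    probs = {
        pair: n / context_counts[pair[0]]
        for pair, n in bigram_counts.items()
    }
    return context_counts, probs


# Made-up sample transcripts for illustration only.
counts, probs = build_syllable_bigram_model([["ba", "na", "na"], ["ba", "by"]])
```

A practical model would normally add smoothing for unseen syllable pairs; that is omitted here to keep the counting and probability steps visible.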

23. (Previously Presented) The method of claim 1, wherein the user query comprises a word.

24. (Previously Presented) The method of claim 23, wherein the searching step further 
comprises transforming the word into a sequence of syllables using a text-to-phonetic syllable map. 
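
Illustrative note (not part of the claim language): the query transformation recited above, followed by a search over decoded syllables, might look roughly like this. The table `SYLLABLE_MAP`, the functions `word_to_syllables` and `find_matches`, and the sample data are all hypothetical. The sketch uses exact sequence matching as a simplification, whereas claim 8 recites only a "substantial match".

```python
# Hypothetical text-to-phonetic syllable map; a real one would be far larger.
SYLLABLE_MAP = {
    "banana": ["ba", "na", "na"],
    "baby": ["ba", "by"],
}


def word_to_syllables(word, syllable_map=SYLLABLE_MAP):
    """Transform a query word into a syllable sequence (empty if unknown)."""
    return syllable_map.get(word.lower(), [])


def find_matches(query_syllables, decoded):
    """Return start times where the query syllables occur consecutively.

    decoded: list of (syllable, start_sec) pairs in time order (assumed
    format for the time-stamped index).
    """
    n = len(query_syllables)
    if n == 0:
        return []
    units = [unit for unit, _ in decoded]
    return [
        decoded[i][1]
        for i in range(len(units) - n + 1)
        if units[i:i + n] == query_syllables
    ]


# Made-up decoded stream for illustration only.
decoded_stream = [("ba", 0.0), ("by", 0.3), ("ba", 0.9), ("na", 1.2), ("na", 1.5)]
hits = find_matches(word_to_syllables("Banana"), decoded_stream)
```

Because the query is reduced to syllables before matching, a word absent from the recognizer's vocabulary could still be located, provided the map can spell it out in syllables.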
25. (Previously Presented) The method of claim 3, wherein a phonetically-based syllable
comprises a toneme. 

26. (Previously Presented) The method of claim 3, wherein two or more different 
pronunciations are associated with a phonetically-based syllable. 

27. (Previously Presented) The method of claim 1, wherein the generating step comprises 
producing the textual representation via stenography. 

28. (Previously Presented) The method of claim 1, wherein the searching step comprises use
of a hierarchical index. 

29. (Previously Presented) The method of claim 1, wherein the searching step comprises use
of an automatic boundary marking system. 